Multi-document summarization using A* search and discriminative training
نویسندگان
چکیده
In this paper we address two key challenges for extractive multi-document summarization: the search problem of finding the best scoring summary and the training problem of learning the best model parameters. We propose an A* search algorithm to find the best extractive summary up to a given length, which is both optimal and efficient to run. Further, we propose a discriminative training algorithm which directly maximises the quality of the best summary, rather than assuming a sentence-level decomposition as in earlier work. Our approach leads to significantly better results than earlier techniques across a number of evaluation metrics.
منابع مشابه
Multi-Document Summarization Using A* Search and Discriminative Learning
In this paper we address two key challenges for extractive multi-document summarization: the search problem of finding the best scoring summary and the training problem of learning the best model parameters. We propose an A* search algorithm to find the best extractive summary up to a given length, which is both optimal and efficient to run. Further, we propose a discriminative training algorit...
متن کاملText Summarization Using Cuckoo Search Optimization Algorithm
Today, with rapid growth of the World Wide Web and creation of Internet sites and online text resources, text summarization issue is highly attended by various researchers. Extractive-based text summarization is an important summarization method which is included of selecting the top representative sentences from the input document. When, we are facing into large data volume documents, the extr...
متن کاملExtractive summarization using a latent variable model
Extractive multi-document summarization is the task of choosing sentences from a set of documents to compose a summary text in response to a user query. We propose a generative approach to explicitly identify summary and non-summary topic distributions in the sentences of a given set of documents (i.e., document cluster). Using these approximate summary topic probabilities as latent output vari...
متن کاملImproving Multi-documents Summarization by Sentence Compression based on Expanded Constituent Parse Trees
In this paper, we focus on the problem of using sentence compression techniques to improve multi-document summarization. We propose an innovative sentence compression method by considering every node in the constituent parse tree and deciding its status – remove or retain. Integer liner programming with discriminative training is used to solve the problem. Under this model, we incorporate vario...
متن کاملMulti-Document Summarization via Discriminative Summary Reranking
Existing multi-document summarization systems usually rely on a specific summarization model (i.e., a summarization method with a specific parameter setting) to extract summaries for different document sets with different topics. However, according to our quantitative analysis, none of the existing summarization models can always produce high-quality summaries for different document sets, and e...
متن کامل